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Bayesian Estimation of Smooth Altimetric 
Parameters: Application to Conventional and 
Delay/Doppler Altimetry 

Abderrahim Halimi^, Corinne Mailhes^, Jean-Yves Tourneret^ and Hichem 

Snoussi ^ 


Abstract 

This paper proposes a new Bayesian strategy for the smooth estimation of altimetric parameters. 

The altimetric signal is assumed to be corrupted by a thermal and speckle noise distributed according 
to an independent and non identically Gaussian distribution. We introduce a prior enforcing a smooth 
temporal evolution of the altimetric parameters which improves their physical interpretation. The 
posterior distribution of the resulting model is optimized using a gradient descent algorithm which 
allows us to compute the maximum a posteriori estimator of the unknown model parameters. 

This algorithm presents a low computational cost which is suitable for real time applications. 

The proposed Bayesian strategy and the corresponding estimation algorithm are validated on both 
synthetic and real data associated with conventional and delay/Doppler altimetry. The analysis of real 
Jason-2 and Cryosat-2 waveforms shows an improvement in parameter estimation when compared 
to the state-of-the-art estimation algorithms. 

Index Terms 

Altimetry, Bayesian algorithm, smooth estimation, delay/Doppler altimetry, SAR altimetry, 
coordinate descent algorithm, natural gradient. 

I. Introduction 

A satellite altimeter is a nadir viewing radar that emits regular pulses and records the 
travel time, the magnitude and the shape of each return signal after reflection on the Earth’s 
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surface. This reflected echo provides information about many parameters such as the range 
between the satellite and the observed scene (denoted by r), the significant wave height 
(denoted by SWH) and the wind speed (related to the echo’s amplitude P„). However, 
altimetric waveforms are corrupted by speckle noise that reduces the quality of the retrieved 
geophysical parameters. Therefore, many recent studies and missions have been focusing 
on improving the quality of these estimated parameters by reducing the noise effect. We 
distinguish between two approaches that have been focusing on improving the altimetric 
technology or improving the processing of the available data. The first approach is based on 
reducing the measurement noise of conventional altimeters (such as Topex and Poseidon-3) 
by increasing the number of observations (looks). This can be achieved by keeping the same 
conventional technology and improving the altimeter characteristics (as for AltiKa [|T|-0), 
or by using a new delay/Doppler processing Q that results in a different echo’s shape Q- 
Q. The second approach processes the available data by using improved models for the 
altimetric echoes [|8|-pT| or improved estimation algorithms 
Following this second approach, the goal of this paper is to elaborate a new Bayesian 
model and its estimation algorithm accounting for the smooth evolution of altimetric signals. 


Usual methods for parameter estimation are based on maximum-likelihood [15| or weighted 
p^ , and unweighted least squares (LS) approaches [[^, Q, 0. Most existing al¬ 
gorithms consider altimetric echoes independently and estimate one set of parameters per 
echo. However, it is well known that adjacent echoes share similar characteristics because 


of their physical nature. This correlation has been considered in [12| which processed a set 
of echoes by imposing a spline evolution for the epoch parameter r. However, this method 


was developed by fixing the significant wave height for the whole set of echoes. In |13|, the 
authors proposed a three-step algorithm that reduces the range noise by first estimating three 
altimetric parameters (SWH, r and Py), then filtering the estimated SWH using a Gaussian 
filter and finally re-estimating the epoch r while fixing SWH. This last algorithm is flexible 
since it does not enforce the parameter SWH to be constant. However, it requires to use a 
sequential estimator whose computational cost can be prohibitive. Moreover, the algorithm 
of p3| requires the characterization of a Gaussian filter for SWH which has been chosen 


empirically. Also, it is interesting to note that the algorithms developed in [12|, [O have 
shown interesting properties especially in reducing the variance of the estimated altimetric 
parameters which encourages the consideration of the echo’s correlation to improve parameter 
estimation. 
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The first contribution of this paper is the elaboration of a hierarchical Bayesian model that 
allows smooth estimation of altimetric parameters by considering the correlation between 
successive altimetric echoes. Each altimetric echo is assumed to be corrupted by an additive, 
independent and non identically Gaussian noise p7| whose mean is the thermal noise. This 
noise model generalizes the independent and identically distributed (i.i.d.) Gaussian noise that 
is generally assumed when considering an un-weighted LS approach Q. The proposed 
approach allows the estimation of the noise parameters from the data which brings additional 
robustness compared to existing algorithms and allows the processing of different kinds of 
data. The parameters of interest of the proposed model (i.e., SWH, r, P„) are assigned priors 
enforcing a smooth evolution between consecutive signals which improves their estimation. 
This prior is defined from the discrete Laplacian of the different parameters. It has shown 


increasing interest for many problems such as image deconvolution [18|, [19|, hyperspectral 


unmixing [20| and medical imaging [211. Note that the proposed Bayesian hierarchy is generic 
in the sense that it can be applied to both conventional altimetry (CA) and delay/Doppler 
altimetry (DDA). 

The second contribution of this paper is the derivation of an estimation algorithm for the 
altimetric and noise parameters associated with the proposed hierarchical Bayesian model. 
The minimum mean square error (MMSE) and maximum a posteriori (MAP) estimators of 
these parameters cannot be easily computed from the obtained joint posterior. This problem 
has been solved in by drawing samples distributed according to this posterior using 
Markov chain Monte Carlo (MCMC) methods. More precisely, an Hamiltonian Monte Carlo 


(HMC) algorithm was used for its good mixing properties for high-dimensional vectors [231. 
The HMC algorithm provided good estimation results and will be briefly recalled in this 
paper. However, the resulting HMC-based estimators have a high computational complexity 
which can be penalizing for real time applications. In this paper, we propose an algorithm 
that significantly reduces this computational cost by using a gradient descent approach. More 


precisely, a coordinate descent algorithm [20[, [24| is used to sequentially update the noise 
and altimetric parameters. This algorithm uses a natural gradient approach [ [25| , weighting 
the gradient direction by the Eisher information matrix and leading to a fast convergence^ 
The proposed Bayesian model and estimation algorithm are validated using synthetic and 


*This approach is also known as Fisher scoring jiot, 


26 


. It updates the parameters in a Riemannian space resulting in 


a fast convergence to a local minimum of the cost function of interest 1271, (28). 
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real echoes acquired during the Jason-2 and Cryosat-2 missions. The obtained results are 
very promising and confirm the accuracy of the proposed Bayesian model and its associated 
algorithm. 

The paper is organized as follows. Section [represents the different models used for CA and 
DDA. The proposed hierarchical Bayesian model and its estimation algorithm are introduced 
in Sections [^ and jlV} Section jVj is devoted to testing and validating the proposed technique 


using synthetic data with controlled ground truth. Section VI shows results obtained using 
real data resulting from Jason-2 and Cryosat-2 missions. Conclusions and future work are 


finally reported in Section VII 


II. Altimetric models 

This section introduces the altimetric models used for CA and DDA that are characterized 
by three altimetric parameters: the significant wave height SWH, the epoch r and the 
amplitude Pu- 


A. Conventional and Brown models 

In CA, the mean power of the altimetric echo s{t) is expressed by a convolution of three 
terms that are the probability density function (PDF) of the heights of the specular scatterers, 
the point target response (PTR) of the radaij^ and the flat surface impulse response (FSIR) 
as follows [| 8 |, [jni 


s{t) = FSIR(f) * PDF(f) * PTRr(f) 


with 


FSIR(f) = Pu exp 
PDF(f) = 


4c 


U {t- Ts) 


■y/^i 


exp 


TTO-., 


P 


PARrit) = 


sm TT 


't) 


2a2 

2 


ttP 


( 1 ) 

( 2 ) 

( 3 ) 

( 4 ) 


where as = t is the time, is the epoch expressed in seconds, c is the speed of light, 
T is the time resolution, 7 is an antenna beamwidth parameter, h is the minimum satellite- 
surface distance and f/(.) denotes the Heaviside function. This model has been considered for 


^Instead of the considered squared cardinal sine PTR, a measured PTR could also be used in CA as in 


29 


, 30 . 
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parameter estimation in many studies sueh as 0, l[T§. However, the numerieal eomputation 
of the double eonvolution ([T]) leads to a high eomputational eost when proeessing a large 
amount of altimetrie data. Therefore, a simplified analytieal model known as the Brown 
model (BM) is often eonsidered. The BM is obtained by assuming a eentered Gaussian 
approximation (with a standard-deviation equals to Up) for the squared eardinal sine PTR^, 
leading to [[§, 0 


s{t) 


2 


1 + erf 


-Ts- aal \ ' 
\/2crc /. 


exp 




(5) 


where 




/SWH\' 


+ 


( 6 ) 


and where erf (f) = ^ 


V 2c J 

dz stands for the Gaussian error funetion, a and ai being 


two known parameters (depending on the satellite and on the Gaussian approximation of 
the PTR). The model @ is elearly more attraetive than ([T]) for designing simple estimation 
algorithms. 


B. Delay/Doppler altimetry 

DDA aims at redueing the measurement noise and inereasing the along-traek resolution 
in eomparison with CA. The mean power of a delay/Doppler eeho ean be expressed as the 
eonvolution of three terms: the time/frequeney FSIR, the PDF of the heights of the speeular 
seatterers and the time/frequeney PTR as follows 0, 0, [jT^, [ [^ 


P(/, /) = FSIR(/, /) * PDF(/) * PTR(/, /) 


with 


PTR(/, /) = PTRr(/)PTR^(/), PTR^(/) = 


sm 


H) 


f 

TT — 
'' F 


where F is the frequeney resolution. The FSIR is given by 0 


p 

FSIR(/, q) = — exp 

71 


4c , , 

“A 


U{t- Ts) 


X [(pt,q+liF) - <Pt,qiF)] 


with 


a,q 


Js) = Re 


aretan 


Vg 


(7) 

( 8 ) 


(9) 

( 10 ) 


lp^{t-Ts) -yl 

where q = I,-- - ,Q, Q being the number of Doppler beams (Q = 64 for Cryosat-2), 
p{t) = y/hcd is the radius of the propagation eireles (iso-range eireles), Uq = ^fq is the 
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ordinate of the g* Doppler beam, fq = {q — 32.5) F is the g* Doppler frequeney (32.5 is 
introdueed to obtain a eentral beam in the ease where Q = 64), Vg is the satellite veloeity, A 
is the wavelength and Re (a:) denotes the real part of the eomplex number x (the interested 
reader is invited to eonsult [[^ for more details about this formula). The double eonvolution 
(|7]) leads to a 2-D delay/Doppler map (DDM). The analysis of the DDA waveform is based 
on a multi-look eeho obtained by applying Doppler proeessing (slant range correetion and 
multi-looking) to the DDM (see 0, [[^). The resulting multi-look delay/Doppler signal ean 
be written 

Q 

s{t) = ^P{t-5tqJq) (11) 

q=l 

where 5tq is the delay eompensation expressed in seeonds. Note finally that the diserete 
altimetrie eeho (for CA, BM, and DDA) is gathered in the veetor s = (si, • • ■ , sk)'^, where 
K = 128 gates and Sk = s {kT). 


C. Noise model 

The altimetrie eehoes are eorrupted by a speekle noise whose infiuenee is redueed by 
averaging, on-board the satellite, a sequenee of L eonseeutive eehoes. This operation reduces 
the noise variance by a/Z when assuming pulse-to-pulse statistical independence between the 
averaged echoe^ Considering the central limit theorem and using the fact that the averaging 
is conducted on a large number of echoes, the resulting noise sequence can be approximated 
by a Gaussian distribution. This approximation has been successfully adopted in the altimetrie 
community p 6 | , pT| , and is implicitly used in the well known least squares estimation 
algorithms 0, p^ , pA| , [34|. Therefore, this paper considers that the altimetrie waveform 
is corrupted by an additive Gaussian noise as follows 

(®m) “f Grri; With Sm ~ -A/” (yjJjra^Ki ^m) (12) 


where (resp. s^) is a (iT x 1) vector representing the mth observed echo (resp. alti- 
metric model), 0m = [ 6 *i(m), 6 ' 2 (m), 6 * 3 (m)] = [SWH(m), r(m), Pu(m)] is a (1 x 3) vector 
containing the 3 altimetrie parameters SWH, r, for the mth echo, is an independent 
and non identically Gaussian noise with mean vector representing the thermal noise 

(where 1 /^ is a P x 1 vector of 1 ), and diagonal covariance matrix = diag (cr^) with 
cr^ = (a^i, ■ ■ ■ , a (iC X 1) vector containing the noise variances of the mth echcj^ 

^This assumption is true for CA echoes while it is not valid for DDA echoes as explained in §• 0 
''Note that the noise variance varies with respect to the time instant k and the observed echo m. 
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Note that the thermal noise parameter is sometimes represented by an additive eonstant that 
is added to the altimetrie model (&m)- In the proposed model, this parameter has been 
ineluded in the noise sequenee of ( [T^ where it represents the noise mean jUm- The 

next seetion introduees the Bayesian model assoeiated with a set of M sueeessive eehoes 
eonsidered in this paper. 

III. Hierarchical Bayesian model 

This seetion introduees a hierarehieal Bayesian model for estimating the parameters of M 
sueeessive altimetrie eehoes. The Bayesian approaeh assigns prior distributions to the un¬ 
known parameters summarizing the prior knowledge about these parameters. More preeisely, 
if / (a;) denotes the prior distribution assigned to the parameter x, the Bayesian approaeh 
eomputes the posterior distribution of x using Bayes rule 

f{x\Y) = oc f{Y\x)f{x) (13) 

where oc means “proportional to” and f{Y\x) is the likelihood of the observation veetor 
Y. The veetor x is then estimated from this distribution by eomputing its mean (MMSE 
estimator) or its maximum (MAP estimator). The following seetions introduee the likelihood 
and the prior distributions eonsidered in this paper. The unknown parameters of the proposed 
model inelude the thermal noise represented by an (M x 1) veetor the {K x M) matrix 
A = [cr ,cr^] eontaining the noise varianees assoeiated with the eonsidered M eehoes, 
and the (M x 3) matrix gathering the 3 altimetrie parameters (SWH, r and P„) of the M 
eehoes. 


A. Likelihood 


The observation model defined in ( [T^ and the Gaussian properties of the noise sequenee 
yield 


/^m.) Sm) OC 




mk , 




(14) 


where x^, = Vm - - dm'^K, and x^ (0^) (resp. (©m)) has been denoted by Xr 


(resp. Sm) for brevity. Assuming independenee between the observations leads to 


M 


f{Y\&,^,A) oc Yl f{ym\®m,dm,'Sr 


(15) 


m=l 


where © = [ 61 , 62 , 63 ] is an (M x 3) matrix eontaining the altimetrie parameters of M eehoes, 
fjL = (/T-i, ■ ■ ■ , is an (M x 1) veetor eontaining the noise means and A 
is a {K X M) matrix eontaining the noise varianees. 
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B. Priors for the altimetric parameters 


The altimetric parameters are assigned priors enforcing a smooth evolution for successive 
waveforms. The considered priors constrain the derivatives of the altimetric parameters to 
have small values. This can be done by assigning a Gaussian distribution to the second 
derivative of the altimetric parameter 0* as follows 


/(0i|ej oc ^ 


M/2 


exp 


-^ 

2e2 " 


(16) 


for i G {1, • • ■ , 3}, where ej is an hyperparameter, || ■ || denotes the standard h norm such 
that 11 a; IP = x'^x, and D is the discrete Laplacian operator. This prior has been referred 
to as simultaneous autoregression (SAR) or conditional autoregression (CAR) in the image 


deconvolution context [ 181, [ 19|. It has also been used for different problems such as spectral 


unmixing of hyperspectral images [20|, medical imaging applications [211 or spectroscopy 
applications [ [35| . The main originality of this work is to consider this prior for the parameters 
of altimetric waveforms. 


C. Prior for the noise parameters 

The noise parameters introduced in ( [T^ are also estimated by the proposed estimation 
algorithm. The absence of knowledge about the noise means (which represent the thermal 
noise) and covariances can be issued by choosing a non informative priors for these pa¬ 
rameters. More precisely, we consider the following conjugate Gaussian prior with a large 
variance for the noise mean 

f{p\f^)^Af{OM,f^lM) (17) 

where Im is the (M x M) identity matrix, Om is an (M x 1) vector of 0 and is a fixed large 
variance {ff = 10^ in our simulations). Considering the noise covariance matrix, we have 
first to note that in the case of speckle noise, these variances depend on the echo’s shape at 
each instant, i.e., the variances change from one echo to the other. However, we are generally 
interested in parameters estimated with a frequency of 1 Hz, i.e., we consider that the physical 
parameters do not change significantly in 1 second. This assumption means that the shapes 
of successive echoes acquired within a time interval of 1 s do not change significantly and 
thus that the noise variances are almost the same for these echoes. Therefore, we consider 
that the noise variance is constant for each r = 20 successive echoed This means that 

^The pulse repetition frequency of Jason-2 is 2 kHz. The echoes are generally averaged to deliver a 20 Hz data, i.e., it 
delivers 20 echoes for each second of data. 
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(^ln-i)r+i,k = ■■■ = crl^^k for n G {1, • •' ,N}, with N = ^ (note that the general case 
is obtained by considering r = 1). After assuming prior independence between the noise 
variances the non informative scale prior of A is defined as 

N K 

/ (A) = n n <‘8) 

n=l k=l ^'T-r^k 

where Ir+(.) is the indicator of the set IR+. 


D. Hyperparameter priors 

The hyperparameters i G {1, ■ ■ ■ , 3} are assigned independent inverse gamma distribu¬ 
tions as follows 

ei ^Xg{aM (19) 


where ~ means “is distributed according to” and a*, bi are fixed hyperparameters that depend 


on the dynamic magnitude of the Ah altimetric parameter [351. 


E. Marginalized posterior distribution 

The proposed Bayesian model includes the parameters ©,^, A and the hyperparameters 
G {I,-- - ,3} which is summarized in the directed acyclic graph (DAG) displayed in 
Fig. The joint posterior distribution of this Bayesian model can be computed from the 
following hierarchical structure 

/ (0, M. A, e|y) K /(yle, , 1 . A)/ (e. a, e) ao) 


f(@,M,A,e) = (21) 

i=l 

where we have assumed a priori independence between the altimetric parameters and hyper¬ 
parameters. This distribution can be marginalized with respect to (w.r.t.) e as follows 

3 

/(0,,i,A|y) oc /(y|e,M,A)/(MlV'")/(A)n 

1=1 
3 

oc f(rl0, p, A)/ {p\g^) / (A) n / bi) (22) 

i=l 


f{^i\(^l)f{4\ai,bi)de^i 


where 


/ {ei\ai,bi) oc 


D0^f 

2 



— ai — Ml2 


(23) 
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The advantage of this marginalization is to get rid of the hyperparameter veetor e and thus 
to make the estimation proeedure more robust. However, even with this simplification, the 


MMSE and MAP estimators associated with the marginalized posterior ( |22| ) are not easy to 
determine mainly because of the nonlinearities relating the noiseless altimetric waveform 


and the parameters of interest (r and SWH). In [221, an MCMC algorithm was proposed to 
sample the joint posterior distribution of a similar Bayesian model. The generated samples 
were then used to compute the MMSE parameter estimators (see Appendix for more 
details about this algorithm). However, this MCMC algorithm has a high computational 
complexity which can be penalizing for real time applications. In this paper, we propose 
an alternative based on an optimization technique maximizing the marginalized posterior 
w.r.t. the parameters of interest (or equivalently, minimizing the negative log-posterior 


—log[p(©, A|Y)] defined in (|22l)). The mode of the posterior distribution ( |22| ) is classically 
referred to as marginalized MAP estimator of (©,^r, A). 


IV. Coordinate descent algorithm 

This section describes a gradient descent algorithm maximizing the marginalized posterior 
w.r.t. the noise and altimetric parameters, i.e., determining the marginal MAP estimator 
of (©,^, A). As explained before, maximizing ( |22| ) w.r.t. (©,^r,A) reduces to minimizing 
the negative log-posterior defined as (after removing unnecessary constants) 

K N M T-^-1 

c(e,^,A) = + + 

k=l n=l 



( 24 ) 


Because of the large number of parameters in (©,^, A), we propose a coordinate descent 


algorithm [20|, [24| that sequentially updates the different parameters. More precisely, the 
algorithm estimates the altimetric and noise parameters iteratively. In each step, the parameters 
of interest are estimated while holding the other parameters fixed. This process is repeated 
until the algorithm has converged to a local minimum of the cost function C(©, n, A). The 
main steps of the resulting algorithm are described in Algo. The next sections describe 
the sub-optimization procedures maximizing the cost function C(©, n, A) w.r.t. the altimetric 
and noise parameters. A convergence diagnosis is also provided to study the convergence of 
the proposed approach. 
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Algorithm 1 Gradient descent algorithm 


1: Initialization 

2: Initialize parameters 9f\ for i G {1, • • ■ ,3}, and t = 1 

3: conv= 0, 

4: Parameter update 

5: while conv= 0 do 

(i) 


6 : 

7: 

8 : 

9: 

10 : 


Update 6 \ , for z G {1, ■ ■ ■ , 3} according to ( |25] ) 
Update according to ( [3T| ) 

Update A^*^ according to ( [Mj ) 

Set conv= 1 if the convergence criteria are satisfied 
t = t + 1 


11 : end while 


1) Natural gradient for altimetric parameters: In order to ensure a fast estimation of 
the altimetric parameters, we propose to use the natural gradient algorithm to minimize C 


w.r.t. © [25|. This gradient algorithm is also known as Fisher scoring approach since it 
weights the gradient direction by the Fisher information matrix of the parameters p0| , p6| . 
The resulting descent algorithm updates the parameters in a Riemannian space resulting 


in a fast convergence to a local minimum of the objective function [27|, Denote as 

rri '"IP ___ 

the(3M X 1) vector gathering the altimetric parameters. The proposed 
algorithm updates this vector as follows 


^d+i) ^ ^(t) 


F-^ (^W) VC (7W) 


where VC (7*^*^) denotes the gradient of the cost function C given by 




dC 

Ml 


T 


dC 


dC 

Ml 


with 






89 


li 


89 


Mi 



(25) 


(26) 


(27) 


The Fisher information matrix F (7^*^) can be decomposed as the following block matrix 


F = 


Fi 

Fi2 

F 13 

F 12 

F2 

F23 

F 1‘i 

F23 

CO 


(28) 
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with 


Fij = diag 


F- = F- 

7 -7-7 


dsi 


de 


li 


de 


dsh 1 dsM 

Zj 


li 


’ ddMr^^ 09 


Mj 


M 

CLi -\- - 

2 


WPe^W 


+ hA D^D - (D^DOi) {D^De.y 


\\D0^\\ 


+ bi 


(29) 


Note that the derivatives required to eompute the elements of F are the same as those 
required to eompute VC. Therefore, the only additional eomputation w.r.t. a steepest deseent 
approaeh is the matrix inversion in ( |25] ). Note also that this matrix inversion is faster than a 
new evaluation of the altimetrie model neeessary for a new update of 7 in a steepest deseent 
algorithm. Therefore, it is better to add this inversion and reduee the number of iterations, 
instead of using a simple steepest deseent approaeh that usually requires a lot of iterations 
to eonverge resulting in a high eomputational eost. 

2) Updating the noise parameters: Maximizing the marginalized posterior distribution 
( |2^ w.r.t. the noise parameters (means and varianees) ean be aehieved by eonsidering a 
eoordinate deseent algorithm. This approaeh reduees to maximizing the eonditional distribu¬ 
tions assoeiated with eaeh noise parameter. Using ( [T5] ) and ( [TT] ), it ean be easily shown that 
the eonditional distribution of the noise mean is the following Gaussian distribution 

1 


dm \ymi®mi'Vjrni'^ ~ dV I pmi 


+ Efc=i 


mk , 


with 




^ y-mk ^mk 
.2 

mk 


l^k=l 


(30) 


(31) 


^ ^ + T.k=l(^mk 

Therefore, the parameter veetor p ean be updated using ( |3T] ) whieh is the maximum of the 
Gaussian distribution d^ . Similarly, it ean be shown using ( fT5] ) and ( [T^ that 

N K 

f (A|r , 0 , = n n / Kr,k\Y:k, &k:, d) (32) 


n=lk=l 


and that cr'^rkl^-k: ^k.iU is distributed aeeording to the following inverse-gamma distribution 


cr, 


nr,k 


\Yk,&k ,u ( 2 ’/^ 


with (3 = X]m=(n-i)r+i Thus, the mode of the inverse gamma distribution ( |3^ is 


0-2 - ^ 
'^nk — 


- -f 1 
2 ~ 


(33) 


(34) 
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3) Stopping criteria: Algo. is an iterative algorithm that requires the definition of some 
stopping eriteria. In this paper, we have eonsidered three eriteria and the algorithm is stopped 
if one of them is satisfied. The first eriterion eompares the new value of the eost funetion 
to the previous one and stops the algorithm if the relative error between these two values is 
smaller than a given threshold, i.e., 

|C(*+^)-C(*)| < (35) 


where |. | denotes the absolute value. The seeond eriterion evaluates the new parameter values 
and stops the algorithm if the following eondition is satisfied 


7 


{i+i)_yi)|| <^2(117^*11+6) 


(36) 


The last eriterion is based on a maximum number of iterations T^ax- Note that the proposed 
algorithm ean be used when eonsidering different altimetrie models. Indeed, it just needs 
the definition of the altimetrie model and its derivatives w.r.t. the altimetrie parameters (the 
altimetrie model ean be given by a elosed form expression as for BM, or by a numerieal 


eomputation as for CA and DDA). This model flexibility is highlighted in Seetion VI that 
eonsiders the BM, CA and DDA models for proeessing real data. 

It is also interesting to note that the eoordinate gradient algorithm eonverges if the minimum 


of the eost funetion (24) w.r.t. {Q,p,,A) along eaeh eoordinate is uniquely reaehed [24|. 


This is easily eheeked for the mean veetor p sinee the posterior distribution (22) viewed 


as a funetion of p is proportional to the Gaussian distribution (30) that has a unique mode. 


Similarly, the posterior distribution ( |22| ) viewed as a funetion of is proportional to 


the inverse gamma distribution ( [33] ) that has a unique mode. Considering the altimetrie 
parameters, the eonvexity of the eost funetion eannot be theoretieally demonstrated. However, 
in our simulations obtained with synthetie and real data (see Seetions |V] and [yT] ) the proposed 
algorithm always eonverges to the aetual parameters whieh tends to eonfirm this eonvexity. 
Note finally that other gradient deseent based approaehes have been sueeessfully used in the 
literature to only estimate the altimetrie parameters Q, [15| (without imposing parameter 
smoothness nor estimating the noise statisties). 


V. Validation on synthetic data 

This seetion first introduees the eriteria used to evaluate the performanees of the proposed 
estimation algorithm. The seeond part defines the effeetive number of looks used to exploit 
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the estimated noise eovarianee. The third part of this seetion eompares the performanee of 
the proposed algorithm with those obtained with state-of-the-art approaehes for synthetie data 
with eontrolled ground truth. 


A. Evaluation criteria 


For synthetie eehoes, the true parameters are supposed to be known and ean be used to 
evaluate the estimation performanee of an algorithm. More preeisely, the bias and standard- 
deviation (STD) of the estimator 6*j are defined as follows 


1 


M 


Bias [Oij = \0i{m) - 6'i(m) 


m=l 


(37) 


and 



1 at 

STD {O^ = , 

— ^ \9i{m)-9i{m) 

m=l 


(38) 


for i G {1, ■ ■ ■ , 3}, where 9i{m) (resp. 9i{m) ) is the true (resp. estimated) parameter for the 
mth waveform and M is the number of simulated waveforms. 


B. Effective number of looks 


The proposed algorithm estimates the noise eovarianee A whieh is used to deduee the 
effeetive number of looks (ENL) Q, [16|. This number represents the equivalent number of 
independent eehoes that were averaged to reduee the noise and is defined by 


Nef[{n, k) = 


[y{k)] 


^lrYZ={n-l)r+iym{k) 


2 

^ nr,k 


(39) 


E {im - E {v{k))f] 
for k G {1, • • ■ , and nG {I,-- - ,N}. Note that Aeff(n, k) differs from one temporal gate 
to another and thus depends on /c. In order to avoid this dependeney, the following averaged 
number of looks is eonsidered in this paper 


1 


K 


ENL(n) = A'eff(n, k) 


(40) 


k=l 


for n G {1, • • ■ , iV}. In the ease of synthetie data, this estimation will be eompared to the 
number L of averaged eehoes in terms of bias and standard-deviation. 
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C. Simulation results on synthetic data 

The proposed strategy (denoted by CD for eoordinate deseent) is first investigated when 
eonsidering M = 500 eorrelated altimetrie eehoes generated aeeording to the BM defined 
in Q. The correlation between successive echoes is introduced by considering a smooth 
evolution of the altimetrie parameters. More precisely, we have considered a dynamic evo¬ 
lution of the altimetrie parameters defined as follows SWH(m) = 2.5 + 2cos(0.07m), 
r(m) = 27 + 0.02m if m < 250 and r(m) = 32 — 0.02m if m > 250, and Pn(m) = 
158 + 0.05sin(0.1m), where m denotes the echo number. The synthetic echoes are then 
generated by plugging the values of the altimetrie parameters in the BM and by corrupting 
the resulting echo by a thermal noise firn = 0.025, Vm and a speckle noise resulting from the 
averaging of L = 90 echoes. The proposed CD strategy is compared to the state-of-the-art 
LS algorithm described in Q, Q, pB| and to the Bayesian approach HMC p2| , detailed 
in Appendix Table reports the obtained biases and STDs when estimating the three 
parameters of interest and ENL when it is possible. The considered algorithms are denoted 
by their names (CD, HMC and LS) followed by the considered altimetrie model (BM, CA 
and DDA). For instance, CD-BM represents the CD algorithm when considering BM echoes. 
Table 1^ shows reduced biases for r and SWH when considering the Bayesian CD and HMC 
methods. Indeed, these two algorithms estimate the non-identically distributed noise while the 
LS estimator only estimates the thermal noise and assumes an i.i.d. speckle noise. Thus, the 
Bayesian algorithms involve a more precise model and provide a lower bias for the altimetrie 
parameters (except for Py). Note that all algorithms provide a good estimate of the thermal 
noise n as shown in Table Moreover, from an STD point of view, both Bayesian CD and 
HMC approaches lead to lower STDs than when using an LS-based method since they are 
using the smooth prior defined in ( [T^ . Note that CD-BM improves the LS-BM STDs by a 
factor of 16 for SWH, 5 for r and 3 for Py. Table also highlights the additional interest 
of both Bayesian algorithms that are able to estimate the number of averaged independent 
echoes with a good precision (small bias and STD). Last but not least. Table |I] gives the 
averaged computational times of the three algorithms, for a Matlab implementation on a 
2.4-GHz Intel-i7 PC. As expected, the HMC algorithm is highly computationally expensive, 
due to its sampling procedure, while the proposed CD algorithm is the fastest one, 3 times 
faster than the LS one. Results similar to Table have been found when considering CA and 
DDA models but these results are not provided in this Section for brevity (these models are 
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investigated in the next Seetion when eonsidering Cryosat-2 eehoes). Finally, in the ease of 
synthetie data, the proposed CD approaeh outperforms the elassieal LS one, both in terms of 
bias and STD of the estimated parameters but also in terms of eomputational eost, allowing 
at the same time the estimation of additional noise parameters. Figure illustrates the elear 
improvement of the proposed algorithm when eompared to the LS one, in terms of bias 
and STD. Sub-figures (a-top), (e-top) and (e-top) present the aetual parameter values (blaek 
line) and the estimated ones by eonsidering the LS (red line) and the CD (blue line, almost 
eonfused with the blaek one) algorithms for 500 eehoes. When plotting the differenee between 
the aetual and estimated parameters in sub-figures (a-bottom), (b-bottom) and (e-bottom), the 
interest of the proposed CD strategy is elearly highlighted. 

VI. Results on real data 

In order to evaluate the performanee of the proposed algorithm on real data, we first present 
our eomparison eriteria. Then, two kinds of real data are eonsidered: Jason-2 eehoes for whieh 
BM is applied, and Cryosat-2 for whieh CA and DDA models are more appropriate. 


A. Evaluation criteria 

When proeessing real data, it is not possible to eompare the performanee of the different 


parameter estimation algorithms using the bias and STD defined in ( p7] ) and ( [^ , sinee the 
true parameter values are not known. In the ease of real waveforms, it is usual to eompare 
the different parameter estimations along a pass by plotting the different obtained results [j^, 
p3| and by eonsidering a modified STD. The modified STD is eomputed using (|3^ in whieh 
the true parameter value is approaehed by the mean of the estimated parameters along 20 
sueeessive eehoes. This modified STD is eailed “STD at 20 Hz” [ |^ , [|T7|. 


B. Analysis of Jason-2 data with the Brown model 

This seetion illustrates the performanee of the proposed CD algorithm when applied to a 
real Jason-2 dataset. The eonsidered data last 36 minutes and eonsist of 43000 real eehoes 
that were extraeted from the pass 30 of eyele 35. Figure shows the parameters estimated 
on 700 sueeessive eehoes along the pass when eonsidering the LS-BM (in red), HMC-BM 
(in green) and CD-BM (in blue) algorithms. As observed for synthetie data in Seetion |V| 
the LS-BM estimates present a high estimation noise while both CD and HMC algorithms 
provide smoother estimates whieh are physieally more eonsistent. Moreover, CD and HMC 
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algorithms appear to be more robust to outliers as illustrated for the estimate #890 of Note 
that the estimated SWH is slightly larger for the Bayesian algorithms when eompared to LS. 
This differenee ean be explained by the i.i.d. noise assumption used in LS algorithm whieh is 
not in adequation with the eonsidered data as already diseussed in p3| , [16|. The reduetion 
of estimation varianee provided by the Bayesian algorithms is partieularly notieeable when 
plotting the speetral power density of SWH and r as in Fig. This figure shows a elear noise 
reduetion espeeially for oeean seales smaller than 100 km for SWH and 10 km for r (note 
that we have not applied atmospherie and instrumental eorreetions on the estimated epoeh r) 
whieh improves the observation of these seales. More quantitative results are provided in Table 
1^ This table shows a good agreement between the means of the estimated parameters for 
the LS, HMC and CD algorithms (exeept for SWH as explained above). Bayesian algorithms 
also estimate the ENL whieh provides an indieation about the noise level eorrupting the data. 
The estimated ENL ranges between 70 and 80 eehoes whieh is elose to the aetual value 
(around 90 or 100 eehoes). As for synthetie data, the estimated STDs obtained with the CD 
algorithm are smaller than for LS and HMC whieh is of great importanee for many praetieal 
applieations related to oeeanography sueh as bathymetry. Indeed, the CD algorithm improves 
the LS-BM STDs by a faetor of about 20 for SWH, 2 for r and 3 for P„. Note finally that the 
proposed CD algorithm requires a redueed eomputational time two times lower than that of 
the LS algorithm. As previously explained, this time reduetion is due to the parallel seheme 
eonsidered by the CD algorithm whieh is more effieient than the independent proeessing of 
eehoes performed in the LS algorithm. 


C. Comparison between CA and DDA for Cryosat-2 data 

This seetion evaluates the performanee of the proposed CD algorithm for real oeeanie 
Cryosat-2 waveforms. The Cryosat-2 altimeter ealled SIRAL presents three modes that are: 
the low resolution mode (LRM), the synthetie aperture radar mode (SARM) and the synthetie 


aperture radar interferometrie mode (SARInM) [381. This paper eonsiders data from the 
SARM that are used to generate CA|^ and DDA eehoes (see [[^ for more details about these 
eehoes). The eonsidered dataset lasts approximately 100 seeonds (i.e., 2000 eehoes) and was 


obtained in August 2011 by the Cryosat proeessing prototype (CPP) developed by CNES [411 


®These echoes are known under different names: LRM-like |32| , pseudo-LRM reduced-SAR (RDSAR) 

or CA-SARM in 0- These echoes are used to compare the parameters of CA with those of DDA. 
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(which is doing the level 1 proeessing). Note that the CNES-CPP uses data pre-proeessed 


to full bit rate provided by ESA [381. Table IV eompares the estimated parameters using 
the ES algorithm with the CD and HMC algorithms when eonsidering both CA and DDA 
eehoes. Eor all algorithms, we have eonsidered the same instrumental parameters as in [[^, 
[|7| whieh are given in Table III To eompare with [[^, we only present the estimation results 
of SWH and of the sea surfaee height anomaly (SSHA) whieh represent the most important 
parameters (SSHA is obtained by applying all environmental eorreetions on the estimated 
epoeh r). Eigs.[^ andshow the evolution of the estimated SWH and SSHA parameters when 
eonsidering the three studied algorithms. These figures show a good agreement between the 
estimated parameters espeeially between CD and HMC algorithms, whieh is eonfirmed by 


the mean of the estimated parameters provided in Table IV Note that, as for Jason-2, a bias 
seems to appear between the estimated SWH parameter for Bayesian algorithms and the ES 
approaeh, whieh ean be explained similarly. Considering the STDs, and to eompare the results 
of Table |IV] with Q, one has to note that the results of [|^ were obtained when using the ES 
algorithm for both CA and DDA. The proeessing of the Cryosat-2 data further highlights the 
main advantage of the proposed CD approaeh whieh is to reduee drastieally the STDs of the 
estimated parameters. Eor example, STDs(SWH) (resp. STD(SSHA)) goes from 91 em (resp. 
14 em) for ES-CA to 3 em (resp. 4 em) for CD-CA whieh represents a great improvement. The 
same behavior is observed when eonsidering DDA eehoes showing the interest of smoothing 
the altimetrie parameters. The seeond advantage of the CD algorithm is the reduetion of the 


eomputational time espeeially for DDA eehoes. Indeed, Table IV highlights a time reduetion 
by a faetor 20 w.r.t. ES-DDA that proeessed the eehoes independently while CD-DDA uses 
a parallel seheme. Eor CA eehoes, the CD approaeh shows a time improvement by a faetor 
1.5 whieh is still an interesting result. Note finally that the HMC algorithm shows similar 
performanee than the CD algorithm but at a priee of a higher eomputation eost. These results 
eonfirm the good performanee of the proposed algorithm and show its flexibility w.r.t. the 
eonsidered altimetrie models. 
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VII. Conclusions 


This paper proposed a new Bayesian strategy for a smooth parameter estimation of the 
noise and altimetrie parameters. The altimetrie parameters were assigned a prior distribution 
enforeing a smooth temporal evolution. The altimetrie signal was assumed to be eorrupted 
by a thermal and speekle noise distributed aeeording to an independent and non identieally 
Gaussian distribution. The parameter estimation was aehieved using a eoordinate deseent 
algorithm that presents a low eomputational cost which is suitable for real time applications. 
The proposed CD algorithm showed good performance and improved the quality of the 
estimated parameters when applied to both synthetic and real altimetrie echoes from conven¬ 
tional and delay/Doppler altimetry. More precisely, it provided reduced parameter STDs and 
presented a lower computational cost when compared to the state-of-the-art LS algorithm. 
Moreover, the CD algorithm is generic and showed its efficiency when considering three 
altimetrie models. Future work includes the consideration of other altimetrie models such as 
the DDA-SAMOSA model 


|, coastal models p^ , [ |44B and the CA model based on 
the true PTR [29|, p0| . Considering antenna mispointing is also an interesting issue that is 
currently under investigation. 
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Appendix 

Mathematical derivations 

A. Hybrid Gibbs algorithm 

The principle of the Gibbs sampler is to sample according to the conditional distributions of 


the posterior of interest [45|. The MCMC based algorithm proposed in [22| uses this principle 


to sequentially sample the parameters ©, A and e. When a conditional distribution cannot 
be sampled directly, sampling techniques such as the HMC algorithm can be applied (see 


, [461 for more details about this algorithm). This algorithm has shown better mixing 


property than independent or random walk Metropolis-Hasting moves especially in high¬ 


dimensional problems [23|, [[46[. Therefore, it has been considered in [22| since the vectors 


to sample are of size (M x 1). The Gibbs sampler (including the HMC procedures) used to 
sample according to the posterior ( [20| ) consists of the four steps summarized in Algo. This 


Algorithm 2 Hybrid Gibbs sampler 


1 : Initialization 

2 : Initialize parameters 6 f\ for i G {1, • • • ,3}, A*^°^ and 

3 : Update parameters/hyperparameters 

4: for t =1 Abi -I- Nr do 

5: Sample 6 f \ for i G {1, 2, 3} according to ( |^ using HMC 

6 : Sample according to ( |^ 

7 : Sample A*^*^ according to ( |3^ 

8 : Sample according to ( |^ 

9: end for 


algorithm uses the conditional distribution associated with the altimetric parameters and the 
hyperparameters that are obtained by considering the likelihood ( [T5| ) and the priors ( [T^ and 
G3. leading to 


f{0i\Y, ©\i, A, el) cx exp 



E 

m=l 



(41) 


and 


el\Y,B,^l,l:^xg 


/ M WDOif 

-h Oi, ^ 

^2 ’2 



(42) 


for z G (1, ■ ■ ■ , 3}. Note that Algo. draws Amc = ^bi + samples that are used to 
approximate the MMSE estimator of the parameters after removing the first Abi burn-in 
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samples. This algorithm benefits from the robust properties of the MCMC methods and 
allows the estimation of the hyperparameters. However, it presents a high eomputational eost 
that prevents its use for real time applieations. 
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Fig. 2. Actual (black line) and estimated parameter using the LS algorithm (red line) and the proposed CD-BM algorithm 
(blue line), (a) SWFl, (b) r and (c) Pu - The bottom subfigures show the difference between the estimated and real parameters. 
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Fig. 3. Estimated parameters using the LS-BM algorithm (red line), the HMC-BM algorithm (green line) and the CD-BM 
algorithm (blue line) for Jason-2 echoes, (a) SWH, (b) r and (c) Pu- 
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Fig. 4. Estimated spectra of SWH and t using 43000 Jason-2 echoes with the LS algorithm (red line), the HMC-BM 
algorithm (green line) and the CD-BM algorithm (blue line). 
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Fig. 5. Estimated SWH using 2000 Cryosat-2 echoes with the LS algorithm (red line), the HMC algorithm (green line) 
and the CD algorithm (blue line). (Top) CA echoes, (bottom) DDA echoes. 
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Fig. 6. Estimated SSHA using 2000 Cryosat-2 echoes with the LS algorithm (red line), the HMC algorithm (green line) 
and the CD algorithm (blue line). (Top) CA echoes, (bottom) DDA echoes. 
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TABLE I 

Performance on synthetic BM data (500 echoes). 


29 



SWH 

(cm) 

T 

(cm) 

Pu 


ENL 

Bias 

LS-BM 

3.18 

1.1 

- 0.01 

0.51 

- 

CD-BM 

0.32 

0.08 

-0.2 

0.26 

0.97 

HMC-BM 

- 0.02 

-0.13 

-0.2 

-0.1 

0.93 

STD 

LS-BM 

44.7 

6.1 

1.91 

11 

- 

CD-BM 

2.72 

1.1 

0.62 

12 

4.47 

HMC-BM 

5.74 

1.8 

0.52 

6 

4.46 

Average time 
per echo 
in milliseconds 

LS-BM 

8.9 

CD-BM 

3.6 

HMC-BM 

739.7 


TABLE II 

Performance on real Jason-2 data (43000 echoes). 



SWH (cm) 

r (cm) 

Pu 

M 

ENL 

Mean 

LS-BM 

237 

14.67 

167.81 

0.97 

- 

CD-BM 

272 

14.69 

167.52 

1.06 

74 

HMC-BM 

270 

14.69 

167.53 

1.06 

80 

STD 

LS-BM 

55.5 

11.23 

8.18 

0.12 

- 

CD-BM 

2.37 

5.57 

2.79 

0.08 

15 

HMC-BM 

5.58 

7.11 

7.64 

0.08 

9.5 

Average time 
per echo 
in milliseconds 

LS-BM 

12.5 

CD-BM 

6.6 

HMC-BM 

1142.3 


Parameter 

Value 

Frequency 

Wavelength (A) 

Bandwidth (B) 

Altitude (h) 

Burst repetition frequency (BRF) 
Pulse repetition frequency (PRF) 

3 dB Antenna beam width (fladn) 
Velocity (Vs) 

Pulses per burst 

Burst length (rb) 

Doppler beam width 

13.575 GHz 

2.21 cm 

320 MHz 

730 km 

85 Hz 

18182 Hz 

1.1388 degrees 
7000 m/s 

64 pulses 

3.5 ms 

327 m 


TABLE III 

Instrumental parameters. 
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TABLE IV 

Comparison between the LS, CD and HMC algorithms when considering real CA and DDA data 

(2000 Cryosat-2 echoes). 



CA echoes 

1 DDA echoes 

LS-CA 

CD-CA 

HMC-CA 

LS-DDA 

CD-DDA 

HMC-DDA 

Means 

SWH (m) 

2.39 

2.59 

2.53 

2.31 

2.70 

2.70 

SSHA (m) 

-5.12 

-5.20 

-5.19 

-5.13 

-5.20 

-5.21 

STDs at 20 Hz 

SWH (cm) 

91 

3 

20 

45 

3 

8 

SSHA (cm) 

14 

4 

4 

6 

2 

2 

Average time 
per echo (s) 

0.1 

0.07 

252 

0.79 

0.04 

233 
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